Spanish Language Processing at University of Maryland: Building Infrastructure for Multilingual Applications
Abstract
We describe our construction of lexical resources, our creation of tools, our building of an aligned parallel corpus, and an approach to automatic treebank creation that we have been developing with Spanish data. The approach is based on projecting English syntactic dependency information across a parallel corpus.
Related resources
The Spanish DELPH-IN grammar
In this article we present a Spanish grammar implemented in the Linguistic Knowledge Builder system and grounded in the theoretical framework of Head-driven Phrase Structure Grammar. The grammar is being developed in an international multilingual context, the DELPH-IN Initiative, contributing to an open-source repository of software and linguistic resources for various Natural Language Processi...
Improving Multilingual Catalog Search Services by Means of Multilingual Thesaurus Disambiguation
Multilinguality is an important consideration when creating public services in countries like Spain, which has four official languages (Spanish, Catalan, Basque and Galician), and even more so if these services are aimed at a European audience with a large number of official languages. Thus, an initiative for creating a catalog service at the Spanish or at the European level must take into account the ...
How to Add a New Language on the NLP Map: Building Resources and Tools for Languages with Scarce Resources
Those of us whose mother tongue is not English or are curious about applications involving other languages, often find ourselves in the situation where the tools we require are not available. According to recent studies there are about 7200 different languages spoken worldwide – without including variations or dialects – out of which very few have automatic language processing tools and machine...
Rapid Building of an ASR System for Under-Resourced Languages Based on Multilingual Unsupervised Training
This paper presents our work on rapid language adaptation of acoustic models based on multilingual cross-language bootstrapping and unsupervised training. We used Automatic Speech Recognition (ASR) systems in the six source languages English, French, German, Spanish, Bulgarian and Polish to build from scratch an ASR system for Vietnamese, an under-resourced language. System building was performe...
FreeLing 2.1: Five Years of Open-source Language Processing Tools
FreeLing is an open-source multilingual language processing library providing a wide range of language analyzers for several languages. It offers text processing and language annotation facilities to natural language processing application developers, simplifying the task of building those applications. FreeLing is customizable and extensible. Developers can use the default linguistic resources...
Publication date: 2001